Large-Scale Cross-Language Web Page Classification via Dual Knowledge Transfer Using Fast Nonnegative Matrix Trifactorization
Similar resources
Large-Scale Web Page Classification
This research investigates the design of a unified framework for the content-based classification of highly imbalanced hierarchical datasets, such as web directories. In an imbalanced dataset, the prior probability distribution of a category indicates the presence or absence of class imbalance. This may include the lack of positive training instances (rarity) or an overabundance of positive ins...
Fast Nonnegative Matrix Factorization Algorithms Using Projected Gradient Approaches for Large-Scale Problems
Recently, interest in projected gradient (PG) methods has grown considerably because of their high efficiency in solving large-scale convex minimization problems subject to linear constraints. Since the minimization problems underlying nonnegative matrix factorization (NMF) of large matrices match this class of problems well, we investigate and test some recent PG...
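To make the idea in the abstract above concrete, here is a minimal sketch of alternating projected gradient descent for NMF, assuming NumPy. It is not one of the specific PG variants the paper tests (the abstract is truncated before naming them); it is a generic fixed-step scheme. The function name `nmf_projected_gradient` and its parameters are hypothetical, chosen only for illustration. Each block update takes a gradient step on 0.5 * ||V - WH||_F^2 with step size 1/L, where L is the Lipschitz constant of that block's gradient, and then projects onto the nonnegative orthant by clipping at zero.

```python
import numpy as np

def nmf_projected_gradient(V, rank, n_iter=200, seed=0):
    """Illustrative alternating projected gradient NMF (hypothetical helper).

    Approximately minimizes 0.5 * ||V - W @ H||_F^2 subject to W >= 0, H >= 0.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    history = []
    for _ in range(n_iter):
        # H step: the gradient is W^T (W H - V); its Lipschitz constant is
        # the spectral norm of W^T W. The small epsilon avoids division by zero.
        L_h = np.linalg.norm(W.T @ W, 2) + 1e-12
        H = np.maximum(0.0, H - (W.T @ (W @ H - V)) / L_h)
        # W step: the gradient is (W H - V) H^T; Lipschitz constant ||H H^T||_2.
        L_w = np.linalg.norm(H @ H.T, 2) + 1e-12
        W = np.maximum(0.0, W - ((W @ H - V) @ H.T) / L_w)
        # Record the objective after each full sweep.
        history.append(0.5 * np.linalg.norm(V - W @ H, "fro") ** 2)
    return W, H, history
```

With a step of 1/L each block update cannot increase the objective, so `history` is non-increasing. Practical PG methods for large matrices typically replace the fixed step with a line search or a Barzilai-Borwein-style step; that choice is left out of this sketch.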
Web Page Classification using Iterative Cross-Training Algorithm
The paper presents a generalization of the Iterative Cross-Training (ICT) algorithm, which was previously applied to Thai Web page identification [1]. The main concept of ICT is to iteratively train two sub-classifiers by using unlabeled examples in a crossing manner. In this paper, we extend the algorithm in order to classify Web pages into course or non-course ones, which is a more challenging prob...
Page Digest for Large-Scale Web Services
The rapid growth of the World Wide Web and the Internet has fueled interest in Web services and the Semantic Web, which are quickly becoming important parts of modern electronic commerce systems. An interesting segment of the Web services domain comprises the facilities for document manipulation, including Web search, information monitoring, data extraction, and page comparison. These services are bui...
Fast Nonnegative Matrix Tri-Factorization for Large-Scale Data Co-Clustering
Nonnegative Matrix Factorization (NMF) based co-clustering methods have attracted increasing attention in recent years because of their mathematical elegance and encouraging empirical results. However, the algorithms to solve NMF problems usually involve intensive matrix multiplications, which make them computationally inefficient. In this paper, instead of constraining the factor matrices of NMF...
Journal
Journal title: ACM Transactions on Knowledge Discovery from Data
Year: 2015
ISSN: 1556-4681, 1556-472X
DOI: 10.1145/2710021